AITopics | parallel translation

A Theory of Unsupervised Translation Motivated by Understanding Animal Communication

Neural Information Processing Systems | Dec-26-2025, 02:50:38 GMT

Neural networks are capable of translating between languages--in some cases even between two languages where there is little or no access to parallel translations, in what is known as Unsupervised Machine Translation (UMT). Given this progress, it is intriguing to ask whether machine learning tools can ultimately enable understanding animal communication, particularly that of highly intelligent animals. We propose a theoretical framework for analyzing UMT when no parallel translations are available and when it cannot be assumed that the source and target corpora address related subject domains or possess similar linguistic structure. We exemplify this theory with two stylized models of language, for which our framework provides bounds on necessary sample complexity; the bounds are formally proven and experimentally verified on synthetic data. These bounds show that the error rates are inversely related to the language complexity and amount of common ground. This suggests that unsupervised translation of animal communication may be feasible if the communication system is sufficiently complex.

animal communication, name change, unsupervised translation motivated, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Riemannian approach to batch normalization

Minhyung Cho, Jaehyung Lee

Neural Information Processing Systems | Nov-21-2025, 07:31:17 GMT

This ambiguity in the optimization process can be removed by interpreting the space of weight vectors as a Riemannian manifold on which all the scaled versions of a weight vector correspond to a single point on the manifold.

algorithm, artificial intelligence, machine learning, (21 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Language steering in latent space to mitigate unintended code-switching

Goncharov, Andrey, Kondusov, Nikolai, Zaytsev, Alexey

arXiv.org Artificial Intelligence | Oct-17-2025

Multilingual Large Language Models (LLMs) often exhibit unintended code-switching, reducing reliability in downstream tasks. We propose latent-space language steering, a lightweight inference-time method that identifies language directions via PCA on parallel translations and steers token embeddings along these axes to control language identity. Our approach mitigates code-switching while preserving semantics with negligible computational overhead and requires only minimal parallel data for calibration. Empirically, we achieve 95-99% language classification accuracy using a single principal component and reduce next-token distributional divergence by up to 42% across multiple language pairs on Qwen2.5 and Llama-3.2 models. We further analyze the layer-wise evolution of language representations, revealing that language identity concentrates in final layers with near-perfect linear separability.
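
The idea of finding a language direction with PCA and steering along it can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the authors' implementation: the 3-dimensional "embeddings" are invented, PCA is run on the pooled embeddings of parallel sentences, the top component is found by power iteration, and the helper names `fit_language_axis`, `project`, and `steer` are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_language_axis(embeddings, iters=200):
    """Return (mean, unit top principal component) using power iteration."""
    n, dim = len(embeddings), len(embeddings[0])
    mean = [sum(e[j] for e in embeddings) / n for j in range(dim)]
    centered = [[e[j] - mean[j] for j in range(dim)] for e in embeddings]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(dim)]
           for a in range(dim)]
    axis = [1.0] * dim
    for _ in range(iters):
        axis = [dot(row, axis) for row in cov]
        norm = math.sqrt(dot(axis, axis))
        axis = [v / norm for v in axis]
    return mean, axis

def project(vec, mean, axis):
    """Coordinate of vec along the language axis, relative to the pooled mean."""
    return dot([v - m for v, m in zip(vec, mean)], axis)

def steer(vec, mean, axis, target):
    """Shift vec along the axis only, so its projection equals target."""
    shift = target - project(vec, mean, axis)
    return [v + shift * a for v, a in zip(vec, axis)]

# Invented toy data: parallel sentences agree in the last two coordinates
# ("meaning") and differ mainly in the first one ("language").
lang_a = [[2.0, 0.1, 0.3], [2.1, -0.2, 0.1], [1.9, 0.3, -0.2]]
lang_b = [[-2.0, 0.1, 0.3], [-1.9, -0.2, 0.1], [-2.1, 0.3, -0.2]]

mean, axis = fit_language_axis(lang_a + lang_b)
target_b = sum(project(e, mean, axis) for e in lang_b) / len(lang_b)
steered = steer(lang_a[0], mean, axis, target_b)
```

In this toy setting the sign of `project` separates the two languages, and `steered` differs from `lang_a[0]` only along `axis`, which is the sense in which the shift leaves the remaining coordinates untouched.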

computational linguistic, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.13849

Country:

Asia > Thailand (0.15)
Europe > France (0.14)
Asia > Middle East > Qatar (0.14)

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

A Theory of Unsupervised Translation Motivated by Understanding Animal Communication

Neural Information Processing Systems | Jan-19-2025, 07:41:55 GMT

Neural networks are capable of translating between languages--in some cases even between two languages where there is little or no access to parallel translations, in what is known as Unsupervised Machine Translation (UMT). Given this progress, it is intriguing to ask whether machine learning tools can ultimately enable understanding animal communication, particularly that of highly intelligent animals. We propose a theoretical framework for analyzing UMT when no parallel translations are available and when it cannot be assumed that the source and target corpora address related subject domains or possess similar linguistic structure. We exemplify this theory with two stylized models of language, for which our framework provides bounds on necessary sample complexity; the bounds are formally proven and experimentally verified on synthetic data. These bounds show that the error rates are inversely related to the language complexity and amount of common ground.

animal communication, parallel translation, unsupervised translation motivated

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Riemannian approach to batch normalization

Minhyung Cho, Jaehyung Lee

Neural Information Processing Systems | Oct-3-2024, 05:21:48 GMT

Batch Normalization (BN) has proven to be an effective algorithm for deep neural network training by normalizing the input to each neuron and reducing the internal covariate shift. The space of weight vectors in the BN layer can be naturally interpreted as a Riemannian manifold, which is invariant to linear scaling of weights. Following the intrinsic geometry of this manifold provides a new learning rule that is more efficient and easier to analyze. We also propose intuitive and effective gradient clipping and regularization methods for the proposed algorithm by utilizing the geometry of the manifold. The resulting algorithm consistently outperforms the original BN on various types of network architectures and datasets.
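
The scale invariance mentioned in the abstract can be checked numerically. The sketch below is a minimal illustration and not the paper's algorithm: it assumes a single neuron, a hand-made batch, no learned scale or shift, `eps=0`, and a positive scale factor; the function names are hypothetical.

```python
import math

def pre_activations(weights, batch):
    """Dot product of the weight vector with every input row in the batch."""
    return [sum(w * x for w, x in zip(weights, row)) for row in batch]

def batch_norm(values, eps=0.0):
    """Normalize a batch of scalars to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

# Invented inputs: four examples with two features each.
batch = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0], [-2.0, 1.5]]
w = [0.8, -0.3]
scaled_w = [5.0 * wi for wi in w]  # same direction, different length

out = batch_norm(pre_activations(w, batch))
out_scaled = batch_norm(pre_activations(scaled_w, batch))

# Largest difference between the two normalized outputs.
max_gap = max(abs(a - b) for a, b in zip(out, out_scaled))
```

Because `w` and `scaled_w` produce the same normalized output up to floating-point error, only the direction of the weight vector matters here, which is what motivates treating all positive rescalings of a weight vector as one point.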

algorithm, manifold, vector, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Can AI help to increase access to all languages?

#artificialintelligence | Sep-30-2022, 15:14:58 GMT

Languages are the main medium of communication, but there are more than 7,100 languages spoken around the world. People who live in different parts of the world speak different languages, and it's sometimes hard to communicate with people who don't speak our language. This hinders relationships between people and makes it hard to understand one another or build trust. The ability to translate language, then, makes it easier to communicate across borders and makes information more accessible. With the advances in technology and artificial intelligence, online translators such as Google Translate, DeepL, and Bing Translate have made communication a lot easier among those speaking different languages.

machine translation, translation, translator, (16 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.42)

Riemannian approach to batch normalization

Cho, Minhyung, Lee, Jaehyung

Neural Information Processing Systems | Dec-31-2017

Batch Normalization (BN) has proven to be an effective algorithm for deep neural network training by normalizing the input to each neuron and reducing the internal covariate shift. The space of weight vectors in the BN layer can be naturally interpreted as a Riemannian manifold, which is invariant to linear scaling of weights. Following the intrinsic geometry of this manifold provides a new learning rule that is more efficient and easier to analyze. We also propose intuitive and effective gradient clipping and regularization methods for the proposed algorithm by utilizing the geometry of the manifold. The resulting algorithm consistently outperforms the original BN on various types of network architectures and datasets.

algorithm, artificial intelligence, machine learning, (20 more...)

Neural Information Processing Systems

Country: North America > Canada > Ontario > Toronto (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

The Riemannian Geometry of Deep Generative Models

Shao, Hang, Kumar, Abhishek, Fletcher, P. Thomas

arXiv.org Machine Learning | Nov-21-2017

Deep generative models learn a mapping from a low dimensional latent space to a high-dimensional data space. Under certain regularity conditions, these models parameterize nonlinear manifolds in the data space. In this paper, we investigate the Riemannian geometry of these generated manifolds. First, we develop efficient algorithms for computing geodesic curves, which provide an intrinsic notion of distance between points on the manifold. Second, we develop an algorithm for parallel translation of a tangent vector along a path on the manifold. We show how parallel translation can be used to generate analogies, i.e., to transport a change in one data point into a semantically similar change of another data point. Our experiments on real image data show that the manifolds learned by deep generative models, while nonlinear, are surprisingly close to zero curvature. The practical implication is that linear paths in the latent space closely approximate geodesics on the generated manifold. However, further investigation into this phenomenon is warranted, to identify if there are other architectures or datasets where curvature plays a more prominent role. We believe that exploring the Riemannian geometry of deep generative models, using the tools developed in this paper, will be an important step in understanding the high-dimensional, nonlinear spaces these models learn.
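
Parallel translation of a tangent vector along a path is easiest to see on a manifold where it has a closed form. The sketch below uses the unit sphere, which is an assumption made purely for illustration; the paper's algorithm works on manifolds generated by deep models, where no such formula is available. The function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def geodesic_point(x, u, t):
    """Point reached from x after moving an angle t along the unit tangent u."""
    return [math.cos(t) * xi + math.sin(t) * ui for xi, ui in zip(x, u)]

def parallel_transport(x, u, v, t):
    """Transport tangent vector v at x along the great circle with direction u.

    The part of v orthogonal to u is unchanged; the part along u rotates
    with the velocity of the geodesic.
    """
    along = dot(u, v)
    return [vi - along * ui + along * (math.cos(t) * ui - math.sin(t) * xi)
            for xi, ui, vi in zip(x, u, v)]

x = [1.0, 0.0, 0.0]   # start point on the unit sphere
u = [0.0, 1.0, 0.0]   # unit tangent direction of the path at x
v = [0.0, 0.6, 0.8]   # tangent vector to transport
t = math.pi / 2       # travel a quarter of a great circle

y = geodesic_point(x, u, t)
w = parallel_transport(x, u, v, t)
```

The transported vector `w` stays tangent to the sphere at `y` and keeps the length of `v`; these are the two properties that let a change at one point be carried over to another point.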

machine learning, manifold, natural language, (17 more...)

arXiv.org Machine Learning

1711.08014

Country:

North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Riemannian stochastic variance reduced gradient

Sato, Hiroyuki, Kasai, Hiroyuki, Mishra, Bamdev

arXiv.org Machine Learning | Apr-10-2017

Stochastic variance reduction algorithms have recently become popular for minimizing the average of a large but finite number of loss functions. In this paper, we propose a novel Riemannian extension of the Euclidean stochastic variance reduced gradient algorithm (R-SVRG) to a manifold search space. The key challenges of averaging, adding, and subtracting multiple gradients are addressed with retraction and vector transport. We present a global convergence analysis of the proposed algorithm with a decaying step size and a local convergence rate analysis with a fixed step size, both under some natural assumptions. The proposed algorithm is applied to problems on the Grassmann manifold, such as principal component analysis, low-rank matrix completion, and computation of the Karcher mean of subspaces, and outperforms the standard Riemannian stochastic gradient descent algorithm in each case.
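
The roles of retraction and vector transport in a variance-reduced update can be sketched on a simpler manifold than the Grassmannian. The code below is a toy under stated assumptions, not the authors' implementation: it works on the unit sphere, uses renormalization as the retraction and tangent-space projection as the vector transport, and minimizes an invented leading-eigenvector objective. All names and data are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project_to_tangent(x, g):
    """Orthogonal projection of g onto the tangent space of the sphere at x."""
    c = dot(x, g)
    return [gi - c * xi for gi, xi in zip(g, x)]

def retract(x, step):
    """Retraction: move in the ambient space, then renormalize onto the sphere."""
    y = [a + b for a, b in zip(x, step)]
    n = math.sqrt(dot(y, y))
    return [v / n for v in y]

def rgrad(a, x):
    """Riemannian gradient of f_a(x) = -0.5 * (a . x)^2 at x."""
    s = dot(a, x)
    return project_to_tangent(x, [-s * ai for ai in a])

def cost(data, x):
    return -0.5 * sum(dot(a, x) ** 2 for a in data) / len(data)

def rsvrg(data, x0, step=0.05, epochs=5, seed=0):
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(epochs):
        anchor = list(x)  # reference point for this epoch
        grads = [rgrad(a, anchor) for a in data]
        full = [sum(col) / len(data) for col in zip(*grads)]
        for _ in range(len(data)):
            a = rng.choice(data)
            # The correction lives in the tangent space at the anchor, so it
            # is transported (here: projected) to the tangent space at x.
            correction = [g - f for g, f in zip(rgrad(a, anchor), full)]
            moved = project_to_tangent(x, correction)
            direction = [g - c for g, c in zip(rgrad(a, x), moved)]
            x = retract(x, [-step * d for d in direction])
    return x

data = [[3.0, 0.0, 0.0], [2.5, 0.5, 0.0], [3.0, -0.5, 0.2], [2.8, 0.1, -0.3]]
x0 = [1.0 / math.sqrt(3.0)] * 3
x_final = rsvrg(data, x0)
```

The retraction keeps every iterate on the sphere, and the transport step is what makes it meaningful to subtract gradients that were computed at different points.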

artificial intelligence, machine learning, r-svrg, (18 more...)

arXiv.org Machine Learning

1702.05594

Country: